Reducing Lexical Redundancy by Augmenting Conceptual Knowledge
Authors
Abstract
Lexical (and structural) ambiguities make language as expressive as it is. Computational lexicons thus have to cope with a large number of polysemous words. Research in the last decade (e.g., Pustejovsky (1995), Kilgarriff and Gazdar (1995)) has aimed at identifying different types of polysemy in order to capture underlying regularities. This paper deals with a subtype of regular polysemy illustrated by the following sentences:
(1) Mrs. Richwoman donated a significant amount to the opera (institution).
(2) The opera (building) was built by a famous architect.
(3) The opera (staff) was on strike yesterday.
(4) The opera (contents) features a Russian Tsar.
(5) I enjoyed the opera (performance) very much.
(6) I bought the opera (recording) at WOM’s.
As these examples show, the English word opera stands for a range of closely interrelated concepts. Bierwisch (1983) introduces the name Konzeptfamilie (concept family) for this range of possible interpretations of lexemes denoting institutions or cultural performances. We call this kind of polysemy inherent polysemy because a single occurrence of a lexeme of this special subtype of regular polysemy can simultaneously be interpreted as different variants. A variant, then, in contradistinction to a reading of other types of polysemes, need not be the exclusive interpretation of the polysemous word:
(7) I enjoyed very much the opera (performance/contents) featuring a Tsar yesterday in the local opera house.
A member of the concept family may be lexicalized as a separate lexeme, like the building variant of the polyseme city council: it is lexicalized as town hall. This seems to imply that the members of the family have conceptual status of their own. The aim of this paper, however, is not to postulate (cognitively adequate) concepts. We have a rather practical goal in view: the construction of a lexicon where knowledge is located at (and inherited from) the most general place possible in the hierarchy.
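The idea of storing knowledge at the most general place in a hierarchy, and letting a concept family collect the variants of one lexeme, can be illustrated with a minimal Python sketch. It is not the authors' formalism: the class `Concept`, its `lookup` method, the feature names, and the institution variant of city council are all assumptions made for illustration. Only the town hall / city council example comes from the text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """A node in a concept hierarchy (hypothetical structure)."""
    name: str
    parent: Optional["Concept"] = None
    features: dict = field(default_factory=dict)

    def lookup(self, key):
        """Return the value stored at the nearest node that defines key."""
        node = self
        while node is not None:
            if key in node.features:
                return node.features[key]
            node = node.parent
        raise KeyError(key)


# Concept lexicon: language-independent knowledge, stored as high as possible.
entity = Concept("entity", features={"countable": True})
building = Concept("building", entity, {"sort": "physical object"})
institution = Concept("institution", entity, {"sort": "social entity"})

# A concept family groups the variants of one inherently polysemous lexeme.
# The institution variant is assumed here for illustration.
council_family = {
    "institution": Concept("city-council/institution", institution),
    "building": Concept("city-council/building", building),
}

# Language lexicon: only the lexeme-to-concept links are language-specific.
english = {
    "city council": list(council_family.values()),
    "town hall": [council_family["building"]],
}


def variants(lexeme):
    """Names of the concept variants a lexeme can denote."""
    return [concept.name for concept in english[lexeme]]
```

In this sketch the polyseme city council points to the whole family, while town hall points to a single member, and neither variant repeats information that its ancestors already carry.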
We group concepts into a concept family whenever this family is lexicalized by an inherently polysemous lexeme. In order to minimize redundancy and maximize reuse of information, especially when building language lexicons for several languages, one should maximize the information stored in the concept lexicon and thereby minimize the information in the language lexicon. Although concept lexicon and language lexicons are distinguished, both are unified in one (inheritance) structure. A combined treatment has been advocated for several reasons:
Similar resources
An Application of a Semantic Framework for the Analysis of Chinese Sentences
Analyzing the semantic representations of 10000 Chinese sentences and describing a new sentence analysis method that evaluates semantic preference knowledge, we create a model of semantic representation analysis based on the correspondence between lexical meanings and conceptual structures, and relations that underlie those lexical meanings. We also propose a semantical argument-head relation t...
A Large-Scale Semantic Structure for Chinese Sentences
Motivated by a systematic analysis of Chinese semantic relationships, we constructed a Chinese semantic framework based on surface syntactic relationships, deep semantic relationships and feature structure to express dependencies between lexical meanings and conceptual structures, and relations that underlie those lexical meanings. Analyzing the semantic representations of 10000 Chinese sentenc...
Redundancy: Helping Semantic Disambiguation
Redundancy is a good thing, at least in a learning process. To be a good teacher you must say what you are going to say, say it, then say what you have just said. Well, three times is better than one. To acquire and learn knowledge from text for building a lexical knowledge base, we need to find a source of information that states facts, and repeats them a few times using slightly different se...
Sentiment Analysis by Augmenting Expectation Maximisation with Lexical Knowledge
Sentiment analysis of documents aims to characterise the positive or negative sentiment expressed in documents. It has been formulated as a supervised classification problem, which requires large numbers of labelled documents. Semi-supervised sentiment classification using limited documents or words labelled with sentiment-polarities are approaches to reducing labelling cost for effective learn...
Submitted to AAAI-96 Technological and Conceptual Tools for Lexical Knowledge Acquisition
The paper deals with the acquisition of static knowledge sources (ontology and the lexicons) for an NLP system. Acquiring the ontology and lexicon together enables immediate feedback between the two acquisition teams, involving shared semi-automatic acquisition tools. The acquisition tools can be technological (mostly corpus-processing and interface-related) and conceptual (such as the use of l...